How to Build a FreeBSD Kernel Module From Scratch


This workshop is designed to help you understand how the userland
communicates with the kernel by studying the workflow of an existing
example, so that by the end you will be able to extend it or write a module
of your own.


Module 1: FreeBSD Kernel Module 


In this module, we will give an overview of the nature of the FreeBSD kernel.
The important configuration files will be explained, and we will learn how to
compile the whole system with more options and with more debugging
information enabled. This is very useful for kernel development.


Module 2: IPFW2 Userland and Kernel Workflow 


In this module, we'll have an overview of ipfw2 - both userland and kernel side - 
and how they both interact. 


Module 3: Through The Userland to Kernel Codes 


In this module, we'll look again at ipfw2, both userland and kernel side,
and how they interact. First, we will see how to use sysctl, introduced in
previous modules, to set simple values. Then we will see how to communicate
settings to the kernel via a socket, following the code path from userland to kernel.


Module 4: DUMMYNET Module Workflow Study 


In this last module, we’ll look not only at ipfw’s communication with the kernel
but also at how the firewall configuration and rules are handled. We will go through
the dummynet module, its workflow, and how it operates with the kernel, so that you
will be able to add new opcodes on your own.


Module 1: FreeBSD Kernel Module


In this module, we will give an overview of the nature of the FreeBSD kernel. The important
configuration files will be explained, and we will learn how to compile the whole system with more
options and with more debugging information enabled. This is very useful for kernel development.


Requirements:
FreeBSD 10.x.
A machine with at least 4 cores is recommended for the system compilation.
Physical hardware or a virtualized environment, at your convenience.

1/ The FreeBSD Kernel 


FreeBSD, like many operating systems, has a monolithic kernel with loadable module support. It is possible to build
the FreeBSD kernel with all needed modules compiled in statically or, for those modules that support it, as separate
dynamically loadable modules.


Dynamic modules can be loaded and unloaded at will with kldload/kldunload, or at boot time with
<name of kernel module>_load="YES" in the /boot/loader.conf file.


To get an overview of all currently loaded modules, you can type kldstat. For example, the output
looks like the following:


Id Refs Address            Size     Name
 1   19 0xffffffff80200000 ...      kernel
 2    1 0xffffffff81e11000 4162     ng_ubt.ko
 9    1 0xffffffff81e5b000 3bab     ng_socket.ko


Let’s load the DTrace module by typing: 


kldload dtraceall 


Then if we type kldstat again, we should see some new entries related to this module:


10    1 0xffffffff81e5f000 89e      dtraceall.ko
11   11 0xffffffff81e60000 9964     opensolaris.ko
12   10 0xffffffff81eea000 857dba   dtrace.ko


If we add dtraceall_load="YES" to /boot/loader.conf, the DTrace framework will be available after a reboot. You
can find an excellent introduction to DTrace in the December 2016 issue of BSDmag
(http://bsdmag.org/download/samba-nfs-and-firewall-new-bsd-issue/).


DTrace is particularly useful for tracing syscalls.

2/ Configuration 


To build the system, we need the source code for both the kernel and the userland. The userland is simply
all the base utilities of FreeBSD. The kernel and userland code are tightly coupled and are
available in the same Subversion repository. Apart from pure BSD code, the tree contains GNU libraries and
software (called contrib code). In addition, some CDDL-licensed code is present for ZFS, CTF (Compact C
Type Format, a debug section similar to DWARF but smaller in size), and DTrace. Conveniently, FreeBSD
ships a Subversion client in base, named svnlite to avoid colliding with the port version.


Check out the source into /usr/src via svnlite as a privileged user (as root, or with sudo):


svnlite co https://svn0.us-east.freebsd.org/base/stable/10 /usr/src

(Alternatively, you can check out the current branch, which has much newer code but is less stable,
by replacing stable/10 with head.)


To better understand how the kernel options work, we will have a look at the
/usr/src/sys/conf/options file.


This file tells the kernel build which header file receives the preprocessor symbol of each option, so
that the symbol can be tested in the sources that include that header.


# $FreeBSD$
#
# Format of this file:
# Option name filename
#
# If filename is missing, the default is
# opt_<name-of-option-in-lower-case>.h

AAC_DEBUG               opt_aac.h
AACRAID_DEBUG           opt_aacraid.h
AHC_ALLOW_MEMIO         opt_aic7xxx.h
AHC_TMODE_ENABLE        opt_aic7xxx.h
AHC_DUMP_EEPROM         opt_aic7xxx.h
AHC_DEBUG               opt_aic7xxx.h
AHC_DEBUG_OPTS          opt_aic7xxx.h
AHC_REG_PRETTY_PRINT    opt_aic7xxx.h
AHD_DEBUG               opt_aic79xx.h
AHD_DEBUG_OPTS          opt_aic79xx.h
AHD_TMODE_ENABLE        opt_aic79xx.h
AHD_REG_PRETTY_PRINT    opt_aic79xx.h
ADW_ALLOW_MEMIO         opt_adw.h


There are up to two fields on each line: the option's name and the header file that is generated with the
relevant preprocessor symbol defined. If the option is present in your kernel configuration file, let's say
AHD_DEBUG_OPTS, the source code can test whether AHD_DEBUG_OPTS is defined and provide some
code specific to this option.
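The default naming rule stated in the file header can be illustrated with a few lines of C. This is only a sketch: default_opt_header is a hypothetical helper written for this workshop, not part of the FreeBSD build tools.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the FreeBSD build tools): writes the
 * default header name for a kernel option into out, following the rule
 * "opt_<name-of-option-in-lower-case>.h" stated in sys/conf/options.
 * Returns 0 on success, -1 if out is too small.
 */
int
default_opt_header(const char *option, char *out, size_t outlen)
{
	size_t i, n;

	assert(option != NULL && out != NULL);
	n = strlen(option);
	/* "opt_" (4) + name (n) + ".h" (2) + terminating NUL (1) */
	if (outlen < n + 7)
		return (-1);
	memcpy(out, "opt_", 4);
	for (i = 0; i < n; i++)
		out[4 + i] = (char)tolower((unsigned char)option[i]);
	memcpy(out + 4 + n, ".h", 3);	/* copies the NUL as well */
	return (0);
}
```

Called with "AHD_DEBUG" and a large enough buffer, the helper produces opt_ahd_debug.h, which is where the symbol would land if the options file did not name opt_aic79xx.h explicitly.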


Let’s imagine we wrote a shiny new kernel module. We could add our own line to this file:


BSDMAG opt_bsdmag.h
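On the C side, code guarded by such an option is usually wrapped in a preprocessor test. The sketch below assumes the BSDMAG option from the line above; bsdmag_enabled is a hypothetical function, and the generated header is only mentioned in a comment so that the fragment also compiles outside the kernel tree.

```c
/*
 * In a real kernel source file this would start with:
 *     #include "opt_bsdmag.h"
 * The header is generated by the kernel build and defines BSDMAG only when
 * "options BSDMAG" appears in the kernel configuration file. The include is
 * left out here so the sketch compiles outside the kernel tree.
 */

/* Hypothetical function: reports whether the BSDMAG option was compiled in. */
int
bsdmag_enabled(void)
{
#ifdef BSDMAG
	return (1);	/* option-specific code would go here */
#else
	return (0);
#endif
}
```

Compiling the fragment with -DBSDMAG mimics what the generated header does when the option is enabled.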


Another important file is /usr/src/sys/conf/files. This file lists the kernel source files and
indicates under which conditions each one is included in the build process.


cam/cam.c                         optional scbus
cam/cam_compat.c                  optional scbus
cam/cam_periph.c                  optional scbus
cam/cam_queue.c                   optional scbus
cam/cam_sim.c                     optional scbus
cam/cam_xpt.c                     optional scbus
cam/ata/ata_all.c                 optional scbus
cam/ata/ata_xpt.c                 optional scbus
cam/ata/ata_pmp.c                 optional scbus
cam/scsi/scsi_xpt.c               optional scbus
cam/scsi/scsi_all.c               optional scbus
cam/scsi/scsi_cd.c                optional cd
cam/scsi/scsi_ch.c                optional ch
cam/ata/ata_da.c                  optional ada | da
cam/ctl/ctl.c                     optional ctl
cam/ctl/ctl_backend.c             optional ctl
cam/ctl/ctl_backend_block.c       optional ctl
cam/ctl/ctl_backend_ramdisk.c     optional ctl
cam/ctl/ctl_cmd_table.c           optional ctl
cam/ctl/ctl_frontend.c            optional ctl
cam/ctl/ctl_frontend_cam_sim.c    optional ctl
cam/ctl/ctl_frontend_internal.c   optional ctl
cam/ctl/ctl_frontend_iscsi.c      optional ctl
cam/ctl/ctl_scsi_all.c            optional ctl
cam/ctl/ctl_tpc.c                 optional ctl
cam/ctl/ctl_tpc_local.c           optional ctl
cam/ctl/ctl_error.c               optional ctl
cam/ctl/ctl_util.c                optional ctl
cam/ctl/scsi_ctl.c                optional ctl
cam/scsi/scsi_da.c                optional da
cam/scsi/scsi_low.c               optional ct | ncv | nsp | stg
cam/scsi/scsi_pass.c              optional pass
cam/scsi/scsi_pt.c                optional pt
cam/scsi/scsi_sa.c                optional sa


Each line contains the path of a source file relative to sys, followed by its type. If the type is optional, the
file is compiled only when the (lower-case) device or option name written afterwards is present in the kernel
configuration.


Again, for our new module, we can add a line for its C source file. For example, for the file
workshop_module1.c:


workshop_bsdmagmodule1.c optional bsdmag


3/ Build


We can now create a custom kernel configuration. Let’s call it WORKSHOP.

cp /usr/src/sys/<arch>/conf/GENERIC /usr/src/sys/<arch>/conf/WORKSHOP

echo "KERNCONF=WORKSHOP" >> /etc/make.conf

(The build will then pick up the new WORKSHOP configuration file; by default it uses the GENERIC one.)


Steps to build a system: 


First, the userland needs to be compiled.


Go to /usr/src.


If your machine has multiple cores, it is advisable to use them for the system compilation.


This might take several hours depending on your configuration.


Build the userland:


make -j<number of cores + 1> buildworld


Build the kernel:


make -j<number of cores + 1> buildkernel


Install the kernel:


make installkernel (installs the kernel under /boot/kernel)


It is also possible to build the userland and to build and install the kernel with a single command:


make -j<number of cores + 1> buildworld kernel

Restart in single user mode (via the command line with shutdown -r, or via the boot menu) and
go to /usr/src.


Then run mergemaster to merge configuration and other files from the current system with the new
ones that were built:


mergemaster -p

Then, install:

make installworld

then:

mergemaster -FUi


mergemaster will try to merge various configuration files and will ask you how you wish to proceed:
merging where possible, replacing the file with the newer one, or keeping the existing file. Be careful: the
system will attempt to merge configuration files in various places, notably those related to ssh and to
users and groups, so take care not to delete additional settings, users, or groups in the process. mergemaster
will always ask you how you want to merge, using your preferred editor.


Restart in normal mode. 


You should now have a working system with the latest fixes and patches for the 10.x branch. As
developers, however, we might need more information from the system for debugging, such as a core dump to
study after a system crash or kernel panic. It is advisable to enable kernel core dumps (this can be done when
installing FreeBSD or, afterwards, via the dumpdev rc.conf variable) at the cost of disk space (the dumps
can be large, so deleting old ones is necessary). They are, by default, located in /var/crash. To debug a
kernel crash dump, the kernel compiled with debugging symbols, kernel.debug, is necessary. kgdb can be
used in the following way.


kgdb 

GNU gdb 6.1.1 [FreeBSD] 

Copyright 2004 Free Software Foundation, Inc. 

GDB is free software, covered by the GNU General Public License, and you are 
welcome to change it and/or distribute copies of it under certain conditions. 
Type "show copying" to see the conditions. 

There is absolutely no warranty for GDB. Type "show warranty" for details. 


This GDB was configured as "amd64-marcel-freebsd"...
Reading symbols from /boot/kernel/ng_ubt.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_ubt.ko.symbols
Reading symbols from /boot/kernel/netgraph.ko.symbols...done.
Loaded symbols for /boot/kernel/netgraph.ko.symbols
Reading symbols from /boot/kernel/ng_hci.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_hci.ko.symbols
Reading symbols from /boot/kernel/ng_bluetooth.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_bluetooth.ko.symbols
Reading symbols from /boot/kernel/ng_l2cap.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_l2cap.ko.symbols
Reading symbols from /boot/kernel/ng_btsocket.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_btsocket.ko.symbols
Reading symbols from /boot/kernel/ng_socket.ko.symbols...done.
Loaded symbols for /boot/kernel/ng_socket.ko.symbols
Reading symbols from /boot/kernel/dtraceall.ko.symbols...done.
Loaded symbols for /boot/kernel/dtraceall.ko.symbols
Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
Loaded symbols for /boot/kernel/opensolaris.ko.symbols
Reading symbols from /boot/kernel/dtrace.ko.symbols...done.
Loaded symbols for /boot/kernel/dtrace.ko.symbols
Reading symbols from /boot/kernel/dtmalloc.ko.symbols...done.
Loaded symbols for /boot/kernel/dtmalloc.ko.symbols
Reading symbols from /boot/kernel/dtnfscl.ko.symbols...done.
Loaded symbols for /boot/kernel/dtnfscl.ko.symbols
Reading symbols from /boot/kernel/fbt.ko.symbols...done.
Loaded symbols for /boot/kernel/fbt.ko.symbols
Reading symbols from /boot/kernel/fasttrap.ko.symbols...done.
Loaded symbols for /boot/kernel/fasttrap.ko.symbols
Reading symbols from /boot/kernel/lockstat.ko.symbols...done.
Loaded symbols for /boot/kernel/lockstat.ko.symbols
Reading symbols from /boot/kernel/sdt.ko.symbols...done.
Loaded symbols for /boot/kernel/sdt.ko.symbols
Reading symbols from /boot/kernel/systrace.ko.symbols...done.
Loaded symbols for /boot/kernel/systrace.ko.symbols
Reading symbols from /boot/kernel/systrace_freebsd32.ko.symbols...done.
Loaded symbols for /boot/kernel/systrace_freebsd32.ko.symbols
Reading symbols from /boot/kernel/profile.ko.symbols...done.
Loaded symbols for /boot/kernel/profile.ko.symbols


#0 sched_switch (td=0xfffff8011b80a940, newtd=<value optimized out>,
flags=-2123250552) at /usr/src/sys/kern/sched_ule.c:1940

1940            cpuid = PCPU_GET(cpuid);


As with its userland counterpart gdb, we can use backtrace (bt).


(kgdb) backtrace

#0 sched_switch (td=0xfffff8011b80a940, newtd=<value optimized out>,
flags=-2123250552) at /usr/src/sys/kern/sched_ule.c:1940

#1 0xffffffff8095b139 in mi_switch (flags=Unhandled dwarf expression opcode
0x93
) at /usr/src/sys/kern/kern_synch.c:492

#2 0xffffffff8099b172 in sleepq_switch (wchan=<value optimized out>,
pri=<value optimized out>) at /usr/src/sys/kern/subr_sleepqueue.c:552

#3 0xffffffff8099... in sleepq_wait (wchan=0xfffff80115316200, pri=Unhandled
dwarf expression opcode 0x93
) at /usr/src/sys/kern/subr_sleepqueue.c:631

#4 0xffffffff8095aa47 in _sleep (ident=0x0, lock=0xfffff80115316230,
priority=0, wmesg=0xffffffff80ff47f2 "-", sbt=0, pr=0, flags=<value optimized
out>) at /usr/src/sys/kern/kern_synch.c:254

#5 0xffffffff8099f778 in taskqueue_thread_loop (arg=<value optimized out>) at
/usr/src/sys/kern/subr_taskqueue.c:118

#6 0xffffffff80... in fork_exit (callout=0xffffffff8099...
<taskqueue_thread_loop>, arg=0xfffff800078ebe90, frame=0xfffffe0232e9fac0) at
/usr/src/sys/kern/kern_fork.c:996

#7 0xffffffff80d4f4fe in fork_trampoline () at
/usr/src/sys/amd64/amd64/exception.S:610

#8 0x0000000000000000 in ?? ()


We can also use list to see the source code around an address (here, the one from frame number 4 above):


(kgdb) list *0xffffffff8095aa47
0xffffffff8095aa47 is in _sleep (/usr/src/sys/kern/kern_synch.c:254).
249             else if (sbt != 0)
250                     rval = sleepq_timedwait(ident, pri);
251             else if (catch)
252                     rval = sleepq_wait_sig(ident, pri);
253             else {
254                     sleepq_wait(ident, pri);
255                     rval = 0;
256             }
257     #ifdef KTRACE
258             if (KTRPOINT(td, KTR_CSW))

Then, for example, we can go up through the stack frames, and so on:

(kgdb) up 2

#4 0xffffffff8095aa47 in _sleep (ident=0x0, lock=0xfffff80115316230,
priority=0, wmesg=0xffffffff80ff47f2 "-", sbt=0, pr=0,
flags=<value optimized out>) at /usr/src/sys/kern/kern_synch.c:254
254                     sleepq_wait(ident, pri);

(kgdb) list
249             else if (sbt != 0)
250                     rval = sleepq_timedwait(ident, pri);
251             else if (catch)
252                     rval = sleepq_wait_sig(ident, pri);
253             else {
254                     sleepq_wait(ident, pri);
255                     rval = 0;
256             }
257     #ifdef KTRACE
258             if (KTRPOINT(td, KTR_CSW))

For more information about gdb, a good course exists at:


http://bsdmag.org/course/application-debugging-and-troubleshooting-2 


If you run the -CURRENT branch, the kernel can crash for various reasons, and the gdb-like tool is 
handy to get a basic understanding of the reasons for the crash. 


Detecting potential deadlocks. 


FreeBSD does not rely on the Giant Lock model anymore. Instead, it uses fine-grained locking.
Hence, the resulting programming can be tricky, and it is easy to introduce lock order problems. As a
kernel developer, you can always enable the WITNESS* kernel options to detect lock order violations and
circular lock references; but beware that the system becomes quite slow if
WITNESS_SKIPSPIN (which basically skips spin locks) is not activated.
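To give an intuition of what lock order checking means, here is a toy, single-threaded illustration. It is not how WITNESS is implemented; all names (toy_witness_*) are made up for this sketch, and it only catches direct reversals between two locks, not longer cycles.

```c
#include <assert.h>
#include <string.h>

#define MAX_LOCKS 8

/* order_seen[a][b] != 0 means lock a was held while lock b was acquired. */
static unsigned char order_seen[MAX_LOCKS][MAX_LOCKS];
static unsigned char held[MAX_LOCKS];

void
toy_witness_reset(void)
{
	memset(order_seen, 0, sizeof(order_seen));
	memset(held, 0, sizeof(held));
}

/*
 * Records the acquisition of lock id "lock". Returns 1 if this acquisition
 * reverses an order that was seen before, 0 otherwise.
 */
int
toy_witness_acquire(int lock)
{
	int a, reversal = 0;

	assert(lock >= 0 && lock < MAX_LOCKS);
	for (a = 0; a < MAX_LOCKS; a++) {
		if (!held[a] || a == lock)
			continue;
		/* Earlier: "lock" before a. Now: a before "lock". */
		if (order_seen[lock][a])
			reversal = 1;
		order_seen[a][lock] = 1;
	}
	held[lock] = 1;
	return (reversal);
}

void
toy_witness_release(int lock)
{
	assert(lock >= 0 && lock < MAX_LOCKS);
	held[lock] = 0;
}
```

Acquiring lock 0 then lock 1, releasing both, and later acquiring lock 1 then lock 0 makes the last call return 1: two code paths taking the same locks in opposite order is the classic recipe for a deadlock.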


Having read this workshop module, you have learned the basics of custom kernel configuration and of
compiling the whole system.


Exercise


- To enable those additional capabilities (deadlock detection, more debugging information, and additional
checking), which options in conf/options are needed?


- Recompile the kernel with those options.


- Once the system is restarted, which differences do you notice? And what are the
downsides?


- Is it possible to improve the situation? How?


Module 2: IPFW2 Userland and Kernel Workflow


In this module, we'll have an overview of ipfw2 - both userland and kernel side - and how they both 
interact. 


1/ IPFW command line settings via sysctl 


The ipfw userland source can be found in the FreeBSD source code we checked out with svn in the first module:


<source code root path>/sbin/ipfw/


In the previous module, we saw the basis of a kernel module. IPFW works differently: it enables and disables
features via sysctl. If you have already done some FreeBSD programming, calls
such as sysctl, sysctlbyname, and sysctlnametomib may be familiar, and you can jump directly to the next
chapter.


Otherwise, IPFW uses sysctlbyname and sysctl. Here are their function signatures:


int sysctlbyname(const char *name, void *oldp, size_t *oldlenp, const void
*newp, size_t newlen);


int sysctl(const int *name, u_int namelen, void *oldp, size_t *oldlenp, const
void *newp, size_t newlen);


If you wish to get a value, the oldp and oldlenp arguments need to be used. To set a value, use the 
newp and newlen arguments. 

For example, to get the number of CPUs available:

int nbcpu;
size_t nbcpulen = sizeof(nbcpu);

if (sysctlbyname("hw.ncpu", &nbcpu, &nbcpulen, NULL, 0) == 0)
    printf("%d cpus\n", nbcpu);


Alternatively (note that namelen is the number of elements in mib, not its size in bytes):

int mib[2];

mib[0] = CTL_HW;
mib[1] = HW_NCPU;

if (sysctl(mib, 2, &nbcpu, &nbcpulen, NULL, 0) == 0)
    printf("%d cpus\n", nbcpu);


Or the opposite, setting a value, like the maximum number of file descriptors:

int maxfiles = 4096;
size_t maxfileslen = sizeof(maxfiles);

if (sysctlbyname("kern.maxfiles", NULL, 0, &maxfiles, maxfileslen) == 0)...

mib[0] = CTL_KERN;
mib[1] = KERN_MAXFILES;

if (sysctl(mib, 2, NULL, 0, &maxfiles, maxfileslen) == 0)...
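The read case can be wrapped in a small function. The sketch below is an illustration under assumptions: cpu_count is a hypothetical name, and the sysconf(3) fallback exists only so that the fragment also builds on systems without sysctlbyname; it is not part of the ipfw code.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#if defined(__FreeBSD__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Returns the number of CPUs, or -1 on error. On FreeBSD it reads the
 * hw.ncpu sysctl; elsewhere it falls back to sysconf(3) so that the sketch
 * still builds (the fallback is not part of the workshop's code).
 */
int
cpu_count(void)
{
#if defined(__FreeBSD__)
	int nbcpu;
	size_t nbcpulen = sizeof(nbcpu);

	/* oldp/oldlenp receive the value; newp/newlen are unused for a read. */
	if (sysctlbyname("hw.ncpu", &nbcpu, &nbcpulen, NULL, 0) != 0)
		return (-1);
	/* The kernel reports how many bytes it actually wrote. */
	assert(nbcpulen == sizeof(nbcpu));
	return (nbcpu);
#else
	long n = sysconf(_SC_NPROCESSORS_ONLN);

	return (n > 0 ? (int)n : -1);
#endif
}
```

Writing a value works the same way with the roles swapped: oldp and oldlenp are NULL and 0, and newp and newlen carry the new value, which generally requires root privileges.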

IPFW uses the sysctl* family of functions to turn the firewall itself on and off, or to make ipfw more verbose.


Here is the code where the firewall is enabled/disabled:


/* EXPLANATION: av holds the parameters passed on the command line */
} else if (_substrcmp(*av, "firewall") == 0) {
    sysctlbyname("net.inet.ip.fw.enable", NULL, 0,
        &which, sizeof(which));
    sysctlbyname("net.inet6.ip6.fw.enable", NULL, 0,
        &which, sizeof(which));


2/ IPFW command line settings via socket 


In addition, ipfw uses a socket to, for example, add a rule via an identified optname. 


Here is a sample of the code responsible for getting settings from the kernel: 


static int
table_do_modify_record(int cmd, ipfw_obj_header *oh,
    ipfw_obj_tentry *tent, int count, int atomic)
{
    ipfw_obj_ctlv *ctlv;
    ipfw_obj_tentry *tent_base;
    caddr_t pbuf;
    char xbuf[sizeof(*oh) + sizeof(ipfw_obj_ctlv) + sizeof(*tent)];
    int error, i;
    size_t sz;
    ...
    error = do_get3(cmd, &oh->opheader, &sz);
    ...


int
do_get3(int optname, ip_fw3_opheader *op3, size_t *optlen)
{
    int error;

    if (co.test_only)
        return (0);

    if (ipfw_socket == -1)
        ipfw_socket = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
    if (ipfw_socket < 0)
        err(EX_UNAVAILABLE, "socket");

    op3->opcode = optname;

    error = getsockopt(ipfw_socket, IPPROTO_IP, IP_FW3, op3,
        (socklen_t *)optlen);

    return (error);
}

An ip_fw3_opheader structure needs to be passed. Here is its raw definition:


typedef struct _ip_fw3_opheader {
    uint16_t opcode;        /* operation identifier */
    uint16_t version;
    uint16_t reserved[2];   /* padding */
} ip_fw3_opheader;
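As a hedged illustration of how such a header is prepared before being handed to getsockopt, the sketch below uses a local mirror of the structure. The names opheader_sketch and opheader_init are invented for this example; the authoritative definition remains the one in sys/netinet/ip_fw.h, and the layout of the padding is an assumption here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Local mirror of the header, for illustration only. The authoritative
 * definition is ip_fw3_opheader in sys/netinet/ip_fw.h; the reserved field
 * stands for the padding mentioned above (an assumption of this sketch).
 */
struct opheader_sketch {
	uint16_t opcode;	/* operation identifier */
	uint16_t version;	/* opcode version */
	uint16_t reserved[2];	/* padding */
};

/* Hypothetical helper: zero the header and store the requested opcode. */
void
opheader_init(struct opheader_sketch *op, uint16_t opcode)
{
	assert(op != NULL);
	memset(op, 0, sizeof(*op));
	op->opcode = opcode;
}
```

Zeroing the whole header first means the version and padding fields never carry stack garbage into the kernel, which is a sensible habit for any structure that crosses the userland/kernel boundary.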


Ipfw3 command list sample (sys/netinet/ip_fw.h):


#define IP_FW_TABLE_XADD        86   /* add entry */
#define IP_FW_TABLE_XDEL        87   /* delete entry */
#define IP_FW_TABLE_XGETSIZE    88   /* get table size (deprecated) */
#define IP_FW_TABLE_XLIST       89   /* list table contents */
#define IP_FW_TABLE_XDESTROY    90   /* destroy table */
#define IP_FW_TABLES_XLIST      92   /* list all tables */
#define IP_FW_DUMP_SOPTCODES    116  /* Dump available sopts/versions */

And here is the list of available opcodes (aka ipfw rules representation): 


enum ipfw_opcodes {
    O_NOP,

    O_IP_SRC,           /* u32 = IP                     */
    O_IP_SRC_MASK,      /* ip = IP/mask                 */
    O_IP_SRC_ME,        /* none                         */
    O_IP_SRC_SET,       /* u32=base, arg1=len, bitmap   */

    O_IP_DST,           /* u32 = IP                     */
    O_IP_DST_MASK,      /* ip = IP/mask                 */
    O_IP_DST_ME,        /* none                         */
    O_IP_DST_SET,       /* u32=base, arg1=len, bitmap   */

    O_IP_SRCPORT,       /* (n)port list:mask 4 byte ea  */
    O_IP_DSTPORT,       /* (n)port list:mask 4 byte ea  */
    O_PROTO,            /* arg1=protocol                */

    O_MACADDR2,         /* 2 mac addr:mask              */
    O_MAC_TYPE,         /* same as srcport              */
    ...

3/ IPFW command from userland to the kernel 


Now, let's follow, in the code, an ipfw command on its way to the kernel.


/* EXPLANATION: here are the table-related commands; we will focus
on printing them */


> ipfw table all list


if (co.use_set || try_next) {
    if (_substrcmp(*av, "delete") == 0)
        ipfw_delete(av);
    else if (_substrcmp(*av, "flush") == 0)
        ipfw_flush(co.do_force);
    else if (_substrcmp(*av, "zero") == 0)
        ipfw_zero(ac, av, 0 /* IP_FW_ZERO */);
    else if (_substrcmp(*av, "resetlog") == 0)
        ipfw_zero(ac, av, 1 /* IP_FW_RESETLOG */);
    /* Here: print and its alias list */
    else if (_substrcmp(*av, "print") == 0 ||
        _substrcmp(*av, "list") == 0)
        ipfw_list(ac, av, do_acct);
    else if (_substrcmp(*av, "show") == 0)
        ipfw_list(ac, av, 1 /* show counters */);
    else if (_substrcmp(*av, "table") == 0)
        ipfw_table_handler(ac, av);
    else if (_substrcmp(*av, "internal") == 0)
        ipfw_internal_handler(ac, av);
    else
        errx(EX_USAGE, "bad command `%s'", *av);
}
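The dispatch above is a plain chain of string comparisons on the first command word. The following sketch reproduces the idea outside of ipfw. It assumes that the comparison accepts abbreviations (a typed word matches when it is a prefix of the keyword); prefix_cmp, dispatch, and the ACT_* values are hypothetical names, not the functions from sbin/ipfw.

```c
#include <assert.h>
#include <string.h>

/*
 * Prefix comparison in the spirit of ipfw's _substrcmp(): returns 0 when
 * typed is a non-empty prefix of keyword, 1 otherwise. This is a simplified
 * stand-in written for this sketch, not the function from sbin/ipfw.
 */
static int
prefix_cmp(const char *typed, const char *keyword)
{
	size_t n = strlen(typed);

	if (n == 0 || strncmp(typed, keyword, n) != 0)
		return (1);
	return (0);
}

enum action { ACT_DELETE, ACT_FLUSH, ACT_LIST, ACT_TABLE, ACT_BAD };

/* Maps the first command-line word to an action; the first match wins. */
enum action
dispatch(const char *word)
{
	assert(word != NULL);
	if (prefix_cmp(word, "delete") == 0)
		return (ACT_DELETE);
	else if (prefix_cmp(word, "flush") == 0)
		return (ACT_FLUSH);
	else if (prefix_cmp(word, "print") == 0 ||
	    prefix_cmp(word, "list") == 0)
		return (ACT_LIST);
	else if (prefix_cmp(word, "table") == 0)
		return (ACT_TABLE);
	return (ACT_BAD);
}
```

Because the first match wins, the order of the branches matters as soon as two keywords share a prefix, which is worth keeping in mind when adding a new command word to such a chain.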


void
ipfw_list(int ac, char *av[], int show_counters)
{
    ipfw_cfg_lheader *cfg;
    struct format_opts sfo;
    size_t sz;

    /* get configuration from kernel */
    cfg = NULL;
    sfo.show_counters = show_counters;
    sfo.show_time = co.do_time;
    sfo.flags = IPFW_CFG_GET_STATIC;

    if (co.do_dynamic != 0)
        sfo.flags |= IPFW_CFG_GET_STATES;
    if ((sfo.show_counters | sfo.show_time) != 0)
        sfo.flags |= IPFW_CFG_GET_COUNTERS;
    if (ipfw_get_config(&co, &sfo, &cfg, &sz) != 0)
        err(EX_OSERR, "retrieving config failed");
    ...


static int
ipfw_get_config(struct cmdline_opts *co, struct format_opts *fo,
    ipfw_cfg_lheader **pcfg, size_t *psize)
{
    ipfw_cfg_lheader *cfg;
    size_t sz;
    int i;

    if (co->test_only != 0) {
        fprintf(stderr, "Testing only, list disabled\n");
        return (0);
    }

    /* Start with some data size */
    sz = 4096;
    cfg = NULL;

    for (i = 0; i < 16; i++) {
        if (cfg != NULL)
            free(cfg);
        if ((cfg = calloc(1, sz)) == NULL)
            return (ENOMEM);

        cfg->flags = fo->flags;
        cfg->start_rule = fo->first;
        cfg->end_rule = fo->last;

        /* This is where the command is sent to the kernel with an initial
        memory estimate; if the data requires more room, the buffer is grown
        and the call is retried */
        if (do_get3(IP_FW_XGET, &cfg->opheader, &sz) != 0) {
            if (errno != ENOMEM) {
                free(cfg);
                return (errno);
            }

            /* Buffer size is not enough. Try to increase */
            sz = sz * 2;
            if (sz < cfg->size)
                sz = cfg->size;
            continue;
        }

        *pcfg = cfg;
        *psize = sz;

        return (0);
    }

    free(cfg);
    return (ENOMEM);
}
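The grow-and-retry pattern used above can be tried without a kernel. In the sketch below, fake_get is a made-up stand-in for do_get3: it fails with ENOMEM when the buffer is too small and leaves a size hint in the buffer, which is an assumption chosen to mimic the cfg->size field. fetch_config and FAKE_CONFIG_SIZE are hypothetical names as well.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define FAKE_CONFIG_SIZE 20000	/* bytes the pretend kernel wants to return */

/*
 * Stand-in for do_get3(): fails with ENOMEM when the buffer is too small and
 * stores the required size in its first bytes, as a hint for the caller.
 */
static int
fake_get(void *buf, size_t *len)
{
	size_t need = FAKE_CONFIG_SIZE;

	if (*len < need) {
		memcpy(buf, &need, sizeof(need));
		errno = ENOMEM;
		return (-1);
	}
	memset(buf, 0xab, need);
	*len = need;
	return (0);
}

/* Fetches the data, doubling the buffer (or using the hint) on ENOMEM. */
int
fetch_config(void **pbuf, size_t *psize)
{
	size_t sz = 4096, hint;
	void *buf = NULL;
	int i, saved;

	assert(pbuf != NULL && psize != NULL);
	for (i = 0; i < 16; i++) {
		free(buf);
		if ((buf = calloc(1, sz)) == NULL)
			return (ENOMEM);
		if (fake_get(buf, &sz) == 0) {
			*pbuf = buf;
			*psize = sz;
			return (0);
		}
		if (errno != ENOMEM) {
			saved = errno;	/* free() may change errno */
			free(buf);
			return (saved);
		}
		memcpy(&hint, buf, sizeof(hint));
		sz *= 2;
		if (sz < hint)
			sz = hint;
	}
	free(buf);
	return (ENOMEM);
}
```

With these numbers the first attempt (4096 bytes) fails, the hint raises the size to 20000, and the second attempt succeeds. The cap of 16 iterations keeps the loop from spinning forever if the data keeps growing between calls.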


Here, the userland part ends and we're going to see what happens in the kernel: 


static int
dump_config(struct ip_fw_chain *chain, ip_fw3_opheader *op3,
    struct sockopt_data *sd)
{
    ipfw_cfg_lheader *hdr;
    struct ip_fw *rule;
    size_t sz, rnum;
    uint32_t hdr_flags;
    int error, i;
    struct dump_args da;
    uint32_t *bmask;

    /* Our data lies contiguously, in raw form, in the ipfw_cfg header struct.
       Below, we get the data we are interested in, calculating the needed
       memory depending on the flags we passed earlier */
    hdr = (ipfw_cfg_lheader *)ipfw_get_sopt_header(sd, sizeof(*hdr));

    /* Depending on the flags passed from the command line, various
       data are going to be displayed */
    if (hdr->flags & IPFW_CFG_GET_STATIC) {
        for (i = da.b; i < da.e; i++) {
            rule = chain->map[i];
            da.rsize += RULEUSIZE1(rule) + sizeof(ipfw_obj_tlv);
            da.rcount++;
            da.tcount += ipfw_mark_table_kidx(chain, rule, bmask);
        }
        /* Add counters if requested */
        if (hdr->flags & IPFW_CFG_GET_COUNTERS) {
            da.rsize += sizeof(struct ip_fw_bcounter) * da.rcount;
            da.rcounters = 1;
        }

        if (da.tcount > 0)
            sz += da.tcount * sizeof(ipfw_obj_ntlv) +
                sizeof(ipfw_obj_ctlv);
        sz += da.rsize + sizeof(ipfw_obj_ctlv);
    }

    if (hdr->flags & IPFW_CFG_GET_STATES)
        sz += ipfw_dyn_get_count() * sizeof(ipfw_obj_dyntlv) +
            sizeof(ipfw_obj_ctlv);
    ...

static int
dump_static_rules(struct ip_fw_chain *chain, struct dump_args *da,
    uint32_t *bmask, struct sockopt_data *sd)
{
    int error;
    int i, l;
    uint32_t tcount;
    ipfw_obj_ctlv *ctlv;
    struct ip_fw *krule;
    caddr_t dst;

    i = 0;
    tcount = da->tcount;
    while (tcount > 0) {
        if ((bmask[i / 32] & (1 << (i % 32))) == 0) {
            i++;
            continue;
        }

        if ((error = ipfw_objhash_ntlv(chain, i, sd)) != 0)
            return (error);

        i++;
        tcount--;
    }
    ...
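The loop above walks a bitmap in which bit i marks table index i as used. A minimal, standalone sketch of the same walk follows; collect_set_bits is a hypothetical name, and the output array replaces the export call made by the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Collects the indexes of the first "count" set bits of bmask into out[].
 * Bit i lives in word i / 32 at position i % 32, as in the loop above.
 * The caller must guarantee that at least "count" bits are set.
 */
void
collect_set_bits(const uint32_t *bmask, uint32_t count, uint32_t *out)
{
	uint32_t i = 0, n = 0;

	assert(bmask != NULL && out != NULL);
	while (count > 0) {
		if ((bmask[i / 32] & (UINT32_C(1) << (i % 32))) == 0) {
			i++;
			continue;
		}
		out[n++] = i;
		i++;
		count--;
	}
}
```

The loop is bounded by the number of set bits rather than by the bitmap length, which is why the count collected earlier (tcount in the kernel code) must match the bitmap exactly.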


int
ipfw_export_table_ntlv(struct ip_fw_chain *ch, uint16_t kidx,
    struct sockopt_data *sd)
{
    struct namedobj_instance *ni;
    struct named_object *no;
    ipfw_obj_ntlv *ntlv;

    ni = CHAIN_TO_NI(ch);

    no = ipfw_objhash_lookup_kidx(ni, kidx);
    KASSERT(no != NULL, ("invalid table kidx passed"));

    ntlv = (ipfw_obj_ntlv *)ipfw_get_sopt_space(sd, sizeof(*ntlv));
    if (ntlv == NULL)
        return (ENOMEM);

    ntlv->head.type = IPFW_TLV_TBL_NAME;
    ntlv->head.length = sizeof(*ntlv);
    ntlv->idx = no->kidx;
    strlcpy(ntlv->name, no->name, sizeof(ntlv->name));

    return (0);
}
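The export function fills one type-length-value (TLV) record in the reply buffer. The sketch below shows the same idea in userland C with simplified stand-in structures. The names tlv_head, name_tlv, export_name, and SKETCH_TLV_NAME are invented for this example; the real layouts are the ones in sys/netinet/ip_fw.h, and snprintf replaces strlcpy only for portability.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-ins; the real layouts live in sys/netinet/ip_fw.h. */
struct tlv_head {
	uint16_t type;		/* what kind of record follows */
	uint16_t flags;
	uint32_t length;	/* total record length, header included */
};

struct name_tlv {
	struct tlv_head head;
	uint16_t idx;		/* kernel index of the named object */
	char name[64];
};

#define SKETCH_TLV_NAME 1	/* made-up type value for this sketch */

/*
 * Appends one name record at offset *used of buf (capacity cap).
 * Returns 0 on success or -1 when the record does not fit.
 */
int
export_name(unsigned char *buf, size_t cap, size_t *used, uint16_t idx,
    const char *name)
{
	struct name_tlv tlv;

	assert(buf != NULL && used != NULL && name != NULL);
	if (*used > cap || cap - *used < sizeof(tlv))
		return (-1);
	memset(&tlv, 0, sizeof(tlv));
	tlv.head.type = SKETCH_TLV_NAME;
	tlv.head.length = sizeof(tlv);
	tlv.idx = idx;
	/* snprintf truncates and always NUL-terminates, like strlcpy. */
	snprintf(tlv.name, sizeof(tlv.name), "%s", name);
	memcpy(buf + *used, &tlv, sizeof(tlv));
	*used += sizeof(tlv);
	return (0);
}
```

Because every record starts with its type and total length, a reader can skip records it does not understand, which is what makes this format convenient for extending the userland/kernel protocol.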


We now have a better idea of how ipfw is structured and how it works. Basically, it uses sysctl for boolean
configuration values and a socket for the firewall rules, table settings, and so on. By now, some ideas may be
starting to emerge for the modules that follow.


In the last module, we will have an overview of how it works on the kernel side with the firewall
rules and configuration, to see how to develop a new type of rule.


Exercises


- To add a new sysctl configuration entry point, explain which part of the kernel needs to be
updated and how (what are the requirements?).


- We saw the list of available opcodes to configure or to get information from the kernel. If it is possible
to add one, what are the requirements for the enum ipfw_opcodes values? What are the requirements
for struct ipfw_insn (and derived) structs in terms of alignment?


- Considering a new feature to add to ipfw, and in case third-party code is used, how should the work
be shared between the userland and the kernel side?


Module 3: Through The Userland to Kernel Codes


In this module, we'll look again at ipfw2, both userland and kernel side, and how they interact.


First, we will see how to use sysctl, introduced in previous modules, to set simple values. Then we will see how to
communicate settings to the kernel via a socket, following the code path from userland to kernel.


1/ IPFW command line settings via sysctl 


The ipfw userland source can be found in the FreeBSD source code we checked out with svn in the first module:

<source code root path>/sbin/ipfw


In the previous module, we saw that it was possible to enable the userland to interact with the kernel
via a character device. IPFW works differently: it enables and disables features via sysctl. If you have done
some FreeBSD programming and are already familiar with calls like sysctl, sysctlbyname, and
sysctlnametomib, you can jump directly to the next chapter.


IPFW uses sysctlbyname and sysctl. Their signatures are:


int sysctlbyname(const char *name, void *oldp, size_t *oldlenp, const void
*newp, size_t newlen);


int sysctl(const int *name, u_int namelen, void *oldp, size_t *oldlenp, const
void *newp, size_t newlen);


If you wish to get a value, the oldp and oldlenp arguments need to be used. To set a value, use the
newp and newlen arguments.

For example, to get the number of CPUs available:

int nbcpu;
size_t nbcpulen = sizeof(nbcpu);

if (sysctlbyname("hw.ncpu", &nbcpu, &nbcpulen, NULL, 0) == 0)
    printf("%d cpus\n", nbcpu);


Alternatively (note that namelen is the number of elements in mib, not its size in bytes):

int mib[2];

mib[0] = CTL_HW;
mib[1] = HW_NCPU;

if (sysctl(mib, 2, &nbcpu, &nbcpulen, NULL, 0) == 0)
    printf("%d cpus\n", nbcpu);


To set a value, like the maximum number of file descriptors:

int maxfiles = 4096;
size_t maxfileslen = sizeof(maxfiles);

if (sysctlbyname("kern.maxfiles", NULL, 0, &maxfiles, maxfileslen) == 0)...

mib[0] = CTL_KERN;
mib[1] = KERN_MAXFILES;

if (sysctl(mib, 2, NULL, 0, &maxfiles, maxfileslen) == 0)...


IPFW uses the sysctl* family of functions to turn the firewall on/off and to make ipfw more verbose.


Here is the part of the code in sbin/ipfw/ipfw2.c where the firewall is enabled or disabled. It is
called by the code in the main() function that parses user-supplied arguments.


} else if (_substrcmp(*av, "firewall") == 0) {
    sysctlbyname("net.inet.ip.fw.enable", NULL, 0,
        &which, sizeof(which));
    sysctlbyname("net.inet6.ip6.fw.enable", NULL, 0,
        &which, sizeof(which));


For more information, especially about all possible requests, check the sysctl man page: 


man 3 sysctl 


2/ IPFW command line settings via socket 


Here, we need to communicate our settings to the kernel side. 
ipfw uses a socket to add a rule via an identified optname/command. 


Here is a sample of code responsible for getting settings from the kernel from sbin/ipfw/tables.c: 


Soe ve. LM 
table do modify record(int cmd; ipfw obj] header *oh, 
Iprw roby -vTentry * Gene, ant counk, Tie alone) 


ipiw.obj ‘ctly *ctlv; 
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Lpiw sob]; tentry “tent base; 

caddr t pbhut; 

Char. 2bur lSsrzeor (™oh): rs. 2e0r (pry Ob) -Coly) +Ssa7e0n (Ptene il? 
Tite Crier “5 


SA Ze t- S27 


error = do_get3(cmd, &oh->opheader, &sz) ; 


int
do_get3(int optname, ip_fw3_opheader *op3, size_t *optlen)
{
    int error;

    if (co.test_only)
        return (0);

    if (ipfw_socket == -1)
        /* even though we could have used AF_LOCAL here, we need to
           distinguish IPv4 from IPv6 matters */
        ipfw_socket = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);

    /* communication with the ipfw2 command line happens here (via do_cmd);
       the "get config" command will be coming from there */
    if (ipfw_socket < 0)
        err(EX_UNAVAILABLE, "socket");

    /* here we send the programmatic equivalent of the command:
       the opcode is its numeric representation */
    op3->opcode = optname;

    error = getsockopt(ipfw_socket, IPPROTO_IP, IP_FW3, op3,
        (socklen_t *)optlen);

    return (error);
}
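The in/out length argument is the part of this calling convention that is easiest to get wrong. As a minimal sketch, the example below uses the standard SO_TYPE option instead of IP_FW3 (which needs root and a FreeBSD kernel with ipfw loaded). The helper name query_socket_type is hypothetical and is not part of ipfw; only the getsockopt() pattern is the same.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Create a socket of the requested type and ask the kernel which type
 * it is, using getsockopt(). Returns the type reported by the kernel,
 * or -1 on failure. Hypothetical helper, for illustration only.
 */
static int
query_socket_type(int wanted_type)
{
	int s, type = 0;
	/* in: size of our buffer; out: number of bytes the kernel wrote */
	socklen_t len = sizeof(type);

	s = socket(AF_INET, wanted_type, 0);
	if (s < 0)
		return (-1);
	if (getsockopt(s, SOL_SOCKET, SO_TYPE, &type, &len) != 0) {
		close(s);
		return (-1);
	}
	close(s);
	assert(len == sizeof(type));
	return (type);
}
```

Note that the length is passed through a real socklen_t variable here. Passing a pointer to a size_t through a cast only works when both types have the same width.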


An ip_fw3_opheader structure needs to be passed. Here is its raw definition:


typedef struct _ip_fw3_opheader {
    uint16_t opcode;        /* operation identifier */
    uint16_t version;       /* opcode version */
    uint16_t reserved[2];   /* padding: align to a 64-bit boundary */
} ip_fw3_opheader;
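Every larger request starts with this header, so the kernel can read the opcode before it knows anything else about the payload. The sketch below mirrors that idea with local, hypothetical types (demo_opheader, demo_request); it does not include the real ip_fw.h, and the payload fields are assumptions chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the 8-byte header, for illustration only. */
typedef struct demo_opheader {
	uint16_t opcode;	/* operation identifier */
	uint16_t version;	/* opcode version */
	uint16_t reserved[2];	/* padding up to 64 bits */
} demo_opheader;

/* Hypothetical request: the header comes first, the payload follows. */
struct demo_request {
	demo_opheader opheader;
	uint32_t flags;
	uint32_t size;
};

/* Zero the request, then fill in the opcode, the flags and the total size. */
static void
demo_request_init(struct demo_request *req, uint16_t opcode, uint32_t flags)
{
	assert(req != NULL);
	memset(req, 0, sizeof(*req));
	req->opheader.opcode = opcode;
	req->flags = flags;
	req->size = sizeof(*req);
}
```

Because the header sits at offset 0, a pointer to the request can be handed to a function that only expects a pointer to the header, which is how do_get3() is called with &cfg->opheader.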


Some representative IP_FW3 commands are shown below. These are from
sys/netinet/ip_fw.h.


#define IP_FW_TABLE_XADD        86   /* add entry */
#define IP_FW_TABLE_XDEL        87   /* delete entry */
#define IP_FW_TABLE_XGETSIZE    88   /* get table size (deprecated) */
#define IP_FW_TABLE_XLIST       89   /* list table contents */
#define IP_FW_TABLE_XDESTROY    90   /* destroy table */
#define IP_FW_TABLES_XLIST      92   /* list all tables */
#define IP_FW_DUMP_SOPTCODES    116  /* Dump available sopts/versions */


And here is the list of available opcodes, also from sys/netinet/ip_fw.h: 


enum ipfw_opcodes {
    O_NOP,

    O_IP_SRC,           /* u32 = IP */
    O_IP_SRC_MASK,      /* ip = IP/mask */
    O_IP_SRC_ME,        /* none */
    O_IP_SRC_SET,       /* u32=base, arg1=len, bitmap */

    O_IP_DST,           /* u32 = IP */
    O_IP_DST_MASK,      /* ip = IP/mask */
    O_IP_DST_ME,        /* none */
    O_IP_DST_SET,       /* u32=base, arg1=len, bitmap */

    O_IP_SRCPORT,       /* (n)port list:mask 4 byte ea */
    O_IP_DSTPORT,       /* (n)port list:mask 4 byte ea */
    O_PROTO,            /* arg1=protocol */

    O_MACADDR2,         /* 2 mac addr:mask */
    O_MAC_TYPE,         /* same as srcport */
3/ IPFW command from userland to the kernel


Now, let's study how an ipfw command makes its way to the kernel, starting from sbin/ipfw/ipfw2.c:


> ipfw table all list 


if (co.use_set || try_next) {
    if (_substrcmp(*av, "delete") == 0)
        ipfw_delete(av);
    else if (_substrcmp(*av, "flush") == 0)
        ipfw_flush(co.do_force);
    else if (_substrcmp(*av, "zero") == 0)
        ipfw_zero(ac, av, 0 /* IP_FW_ZERO */);
    else if (_substrcmp(*av, "resetlog") == 0)
        ipfw_zero(ac, av, 1 /* IP_FW_RESETLOG */);

    /* Here is an example: we can go through the code flow from the
       print/list command */
    else if (_substrcmp(*av, "print") == 0 ||
        _substrcmp(*av, "list") == 0)
        ipfw_list(ac, av, do_acct);
    else if (_substrcmp(*av, "show") == 0)
        ipfw_list(ac, av, 1 /* show counters */);
    else if (_substrcmp(*av, "table") == 0)
        ipfw_table_handler(ac, av);
    else if (_substrcmp(*av, "internal") == 0)
        ipfw_internal_handler(ac, av);
    else
        errx(EX_USAGE, "bad command `%s'", *av);


void
ipfw_list(int ac, char *av[], int show_counters)
{
    ipfw_cfg_lheader *cfg;
    struct format_opts sfo;
    size_t sz;

    /* get configuration from kernel */
    cfg = NULL;
    sfo.show_counters = show_counters;
    sfo.show_time = co.do_time;
    sfo.flags = IPFW_CFG_GET_STATIC;
    if (co.do_dynamic != 0)
        sfo.flags |= IPFW_CFG_GET_STATES;
    if ((sfo.show_counters | sfo.show_time) != 0)
        sfo.flags |= IPFW_CFG_GET_COUNTERS;

    /* We get the general config from here */
    if (ipfw_get_config(&co, &sfo, &cfg, &sz) != 0)
        err(EX_OSERR, "retrieving config failed");


static int
ipfw_get_config(struct cmdline_opts *co, struct format_opts *fo,
    ipfw_cfg_lheader **pcfg, size_t *psize)
{
    ipfw_cfg_lheader *cfg;
    size_t sz;
    int i;

    if (co->test_only != 0) {
        fprintf(stderr, "Testing only, list disabled\n");
        return (0);
    }

    /* Start with some data size */
    sz = 4096;
    cfg = NULL;

    for (i = 0; i < 16; i++) {
        if (cfg != NULL)
            free(cfg);
        if ((cfg = calloc(1, sz)) == NULL)
            return (ENOMEM);

        cfg->flags = fo->flags;
        cfg->start_rule = fo->first;
        cfg->end_rule = fo->last;

        if (do_get3(IP_FW_XGET, &cfg->opheader, &sz) != 0) {
            if (errno != ENOMEM) {
                free(cfg);
                return (errno);
            }

            /* Buffer size is not enough. Try to increase */
            sz = sz * 2;
            if (sz < cfg->size)
                sz = cfg->size;
            continue;
        }

        *pcfg = cfg;
        *psize = sz;
        return (0);
    }

    free(cfg);
    return (ENOMEM);
}
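The grow-and-retry loop used by ipfw_get_config() can be tried out in isolation. The sketch below replaces the kernel with a stand-in function, demo_kernel_get(), which is an assumption made for this example: it fails with ENOMEM when the buffer is too small and leaves the required size at the start of the buffer, much like the real header reports it through cfg->size. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_CONFIG_SIZE 10000	/* bytes the pretend kernel wants to return */

/* Stand-in for do_get3(): reports the needed size, fails if *len is too small. */
static int
demo_kernel_get(void *buf, size_t *len)
{
	size_t need = DEMO_CONFIG_SIZE;

	memcpy(buf, &need, sizeof(need));
	if (*len < need) {
		errno = ENOMEM;
		return (-1);
	}
	*len = need;
	return (0);
}

/* Fetch the config, growing the buffer at most 16 times. */
static int
demo_get_config(void **pbuf, size_t *psize)
{
	void *buf = NULL;
	size_t sz = 4096, hint;
	int i, saved;

	assert(pbuf != NULL && psize != NULL);
	for (i = 0; i < 16; i++) {
		free(buf);
		if ((buf = calloc(1, sz)) == NULL)
			return (ENOMEM);
		if (demo_kernel_get(buf, &sz) == 0) {
			*pbuf = buf;
			*psize = sz;
			return (0);
		}
		if (errno != ENOMEM) {
			saved = errno;
			free(buf);
			return (saved);
		}
		/* Too small: double, or jump to the size that was asked for. */
		memcpy(&hint, buf, sizeof(hint));
		sz *= 2;
		if (sz < hint)
			sz = hint;
	}
	free(buf);
	return (ENOMEM);
}
```

With these numbers the first attempt (4096 bytes) fails, the size hint raises the buffer to 10000 bytes, and the second attempt succeeds. The caller owns the returned buffer and must free() it.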


Here, the userland part ends and we're going to see what happens in the kernel: 


static int
dump_config(struct ip_fw_chain *chain, ip_fw3_opheader *op3,
    struct sockopt_data *sd)
{
    ipfw_cfg_lheader *hdr;
    struct ip_fw *rule;
    size_t sz, rnum;
    uint32_t hdr_flags;
    int error, i;
    struct dump_args da;
    uint32_t *bmask;

    hdr = (ipfw_cfg_lheader *)ipfw_get_sopt_header(sd, sizeof(*hdr));

    // Depending on the flags you passed from the command line, various data
    // are going to be returned

    if (hdr->flags & IPFW_CFG_GET_STATIC) {
        for (i = da.b; i < da.e; i++) {
            rule = chain->map[i];
            da.rsize += RULEUSIZE1(rule) + sizeof(ipfw_obj_tlv);
            da.rcount++;
            da.tcount += ipfw_mark_table_kidx(chain, rule, bmask);
        }
        /* Add counters if requested */
        if (hdr->flags & IPFW_CFG_GET_COUNTERS) {
            da.rsize += sizeof(struct ip_fw_bcounter) * da.rcount;
            da.rcounters = 1;
        }

        if (da.tcount > 0)
            sz += da.tcount * sizeof(ipfw_obj_ntlv) +
                sizeof(ipfw_obj_ctlv);
        sz += da.rsize + sizeof(ipfw_obj_ctlv);
    }

    if (hdr->flags & IPFW_CFG_GET_STATES)
        sz += ipfw_dyn_get_count() * sizeof(ipfw_obj_dyntlv) +
            sizeof(ipfw_obj_ctlv);


static int
dump_static_rules(struct ip_fw_chain *chain, struct dump_args *da,
    uint32_t *bmask, struct sockopt_data *sd)
{
    int error;
    int i, l;
    uint32_t tcount;
    ipfw_obj_ctlv *ctlv;
    struct ip_fw *krule;
    caddr_t dst;

    i = 0;
    tcount = da->tcount;
    while (tcount > 0) {
        if ((bmask[i / 32] & (1 << (i % 32))) == 0) {
            i++;
            continue;
        }

        if ((error = ipfw_export_table_ntlv(chain, i, sd)) != 0)
            return (error);

        i++;
        tcount--;
    }


int
ipfw_export_table_ntlv(struct ip_fw_chain *ch, uint16_t kidx,
    struct sockopt_data *sd)
{
    struct namedobj_instance *ni;
    struct named_object *no;
    ipfw_obj_ntlv *ntlv;

    ni = CHAIN_TO_NI(ch);

    no = ipfw_objhash_lookup_kidx(ni, kidx);
    KASSERT(no != NULL, ("invalid table kidx passed"));

    ntlv = (ipfw_obj_ntlv *)ipfw_get_sopt_space(sd, sizeof(*ntlv));
    if (ntlv == NULL)
        return (ENOMEM);

    ntlv->head.type = IPFW_TLV_TBL_NAME;
    ntlv->head.length = sizeof(*ntlv);
    ntlv->idx = no->kidx;
    strlcpy(ntlv->name, no->name, sizeof(ntlv->name));

    return (0);
}
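The export function above appends one type-length-value (TLV) record to a bounded reply buffer and fails with ENOMEM when the buffer is full. The following userland sketch shows the same pattern. The structures are simplified assumptions, not the real ipfw_obj_ntlv layout, and all demo_* names are hypothetical; snprintf() stands in for strlcpy() so the example builds on systems without it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_TLV_TBL_NAME 1	/* made-up type code for this sketch */

struct demo_tlv {
	uint16_t type;		/* what kind of record this is */
	uint16_t flags;
	uint32_t length;	/* total record length, header included */
};

struct demo_ntlv {
	struct demo_tlv head;
	uint16_t idx;		/* kernel index of the named object */
	uint16_t spare;
	char name[64];
};

struct demo_buf {
	char data[256];		/* reply buffer */
	size_t used;		/* bytes already filled */
};

/* Append one name TLV; returns 0, or ENOMEM when no space is left. */
static int
demo_export_name(struct demo_buf *b, uint16_t idx, const char *name)
{
	struct demo_ntlv ntlv;

	assert(b != NULL && name != NULL);
	if (b->used + sizeof(ntlv) > sizeof(b->data))
		return (ENOMEM);
	memset(&ntlv, 0, sizeof(ntlv));
	ntlv.head.type = DEMO_TLV_TBL_NAME;
	ntlv.head.length = sizeof(ntlv);
	ntlv.idx = idx;
	snprintf(ntlv.name, sizeof(ntlv.name), "%s", name);
	memcpy(b->data + b->used, &ntlv, sizeof(ntlv));
	b->used += sizeof(ntlv);
	return (0);
}
```

A reader of the buffer can walk it record by record: read the header, dispatch on type, then skip length bytes to reach the next record.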


Now, we have a better idea of how ipfw works. Basically, sysctls are used for simple values such as
boolean settings, and sockets for firewall rules, table settings and so on. Some ideas for your own
extensions might start to emerge in the next module.


In the next and final module, we will look at how things work on the kernel side, at the firewall
rules and at the configuration, to see how to develop a new type of rule.


Exercises


- To add a new sysctl configuration entry point, explain which part of the kernel needs to be
updated and how (what are the requirements?).


- We saw the list of available opcodes used to configure the kernel or to get information from it. If it
is possible to add one, what are the requirements for the enum ipfw_opcodes values? What are the
requirements for struct ipfw_insn (and derived) structs in terms of alignment?


- Considering a new feature to add to ipfw, and in case third-party code is used, how ought the work
to be shared between the userland and the kernel side?
Module 4: DUMMYNET Module Workflow Study


In this last module, we'll not only look at ipfw’s communication with the kernel but also at how the
firewall configuration and rules are handled.


We will go through the dummynet module, its workflow and how it operates with the kernel, so that you
will be able to add new opcodes on your own.


1/ DUMMYNET module study 


The dummynet module allows setting network bandwidth limits (called traffic shaping), and is an
optional submodule of ipfw. Unlike what the name suggests, it is not a kind of fake/no-op module; the
name is due to historical reasons, as it was a test ensemble in the beginning. Since it is optional, it
has to be enabled via the kernel configuration option DUMMYNET (declared in sys/conf/options). Beware
that there are compatibility layers that you might need to take care of when you develop a kernel
module: the compat layer for 32 bits (if needed), cloudabi (the secure POSIX interface layer) and
probably the Linux API compatibility layer.


Programmatically, there are four flags available: adding a pipe, deleting a pipe, flushing, and getting
the pipe info. A pipe is to be viewed as a workflow queue where every packet belonging to this same
queue will be treated by the scheduler (for more details, see sys/netpfil/ipfw/dummynet.txt).


#define IP_DUMMYNET_CONFIGURE   60
#define IP_DUMMYNET_DEL         61
#define IP_DUMMYNET_FLUSH       62
#define IP_DUMMYNET_GET         64


=> The dummynet module ought to be loaded after ipfw. The DN_MODEV_ORD definition in
sys/netpfil/ipfw/ip_dummynet.c ensures this:


#define DN_SI_SUB       SI_SUB_PROTO_IFATTACHDOMAIN

/* dummynet is guaranteed to start after ipfw, given that the ipfw init phase
   occurs at (SI_ORDER_ANY - 255), giving enough room for modules, as you can see */
#define DN_MODEV_ORD    (SI_ORDER_ANY - 128) /* after ipfw */
DECLARE_MODULE(dummynet, dummynet_mod, DN_SI_SUB, DN_MODEV_ORD);
MODULE_DEPEND(dummynet, ipfw, 3, 3, 3);
MODULE_VERSION(dummynet, 3);


Now, let's see dummynet's configuration part. First, the sockopt data is copied from userland to the
kernel.


Then the configuration is applied in FreeBSD 7.x or FreeBSD 8.x format. For backward compatibility,
the old FreeBSD 7 syntax is still supported (the new syntax is much less error-prone and more
consistent in its pipe usage).


case IP_DUMMYNET_CONFIGURE:
    v = malloc(len, M_TEMP, M_WAITOK);
    error = sooptcopyin(sopt, v, len, len);
    if (error)
        break;
    error = dn_compat_configure(v);
    free(v, M_TEMP);
    break;


static int
dn_compat_configure(void *v)
{
    struct dn_id *buf = NULL, *base;
    struct dn_sch *sch = NULL;
    struct dn_link *p = NULL;
    struct dn_fs *fs = NULL;
    struct dn_profile *pf = NULL;
    int lmax;
    int error;

Those two structs represent the FreeBSD 7 and FreeBSD 8 pipe configuration formats:

    struct dn_pipe7 *p7 = (struct dn_pipe7 *)v;
    struct dn_pipe8 *p8 = (struct dn_pipe8 *)v;

If we have a pipe to configure:

    i = p7->pipe_nr;
    if (i != 0) { /* pipe config */

We take chunks of the buffer:

        sch = o_next(&buf, sizeof(*sch), DN_SCH);
        p = o_next(&buf, sizeof(*p), DN_LINK);
        fs = o_next(&buf, sizeof(*fs), DN_FS);

        error = dn_compat_config_pipe(sch, p, fs, v);


=> Here, we carry out the traffic shaping configuration: bandwidth, delay and burst.


static int
dn_compat_config_pipe(struct dn_sch *sch, struct dn_link *p,
    struct dn_fs *fs, void *v)
{
    struct dn_pipe7 *p7 = (struct dn_pipe7 *)v;
    struct dn_pipe8 *p8 = (struct dn_pipe8 *)v;
    int i = p7->pipe_nr;

    sch->sched_nr = i;
    sch->oid.subtype = 0;
    p->link_nr = i;
    fs->fs_nr = i + 2*DN_MAX_ID;
    fs->sched_nr = i + DN_MAX_ID;

    /* Common to 7 and 8 */
    p->bandwidth = p7->bandwidth;
    p->delay = p7->delay;
    if (!is7) {
        /* FreeBSD 8 has burst */
        p->burst = p8->burst;
    }

    /* fill the fifo flowset */
    dn_compat_config_queue(fs, v);
    fs->fs_nr = i + 2*DN_MAX_ID;
    fs->sched_nr = i + DN_MAX_ID;

    /* Move scheduler related parameter from fs to sch */
    sch->buckets = fs->buckets; /*XXX*/
    fs->buckets = 0;
    if (fs->flags & DN_HAVE_MASK) {
        sch->flags |= DN_HAVE_MASK;
        fs->flags &= ~DN_HAVE_MASK;
        sch->sched_mask = fs->flow_mask;
        bzero(&fs->flow_mask, sizeof(struct ipfw_flow_id));
    }

    return 0;
}


=> Once the bandwidth is set (and possibly the extra burst allowance),
these settings are used here.


    if (si->idle_time < dn_cfg.curr_time) {
        /* Do this only on the first packet on an idle pipe */
        struct dn_link *p = &fs->sched->link;

        si->sched_time = dn_cfg.curr_time;
        si->credit = dn_cfg.io_fast ? p->bandwidth : 0;
        if (p->burst) {
            uint64_t burst = (dn_cfg.curr_time - si->idle_time) * p->bandwidth;
            if (burst > p->burst)
                burst = p->burst;
            si->credit += burst;
        }
    }
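The credit computation for an idle pipe can be isolated into a pure function to see how the burst allowance is capped. This is a sketch only: demo_initial_credit is a hypothetical name, and time and bandwidth are treated as plain numbers rather than the tick and bit-rate units the kernel uses.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Credit granted to the first packet on a pipe that has been idle
 * since 'idle_since'. The burst grows with the idle time but never
 * exceeds 'burst_max'; a 'burst_max' of 0 disables the burst.
 */
static int64_t
demo_initial_credit(uint64_t now, uint64_t idle_since, uint64_t bandwidth,
    uint64_t burst_max, int io_fast)
{
	int64_t credit;
	uint64_t burst;

	assert(now >= idle_since);
	credit = io_fast ? (int64_t)bandwidth : 0;
	if (burst_max != 0) {
		burst = (now - idle_since) * bandwidth;
		if (burst > burst_max)
			burst = burst_max;
		credit += (int64_t)burst;
	}
	return (credit);
}
```

For example, a pipe idle for 10 time units at bandwidth 1000 would accumulate 10000 units, but with a burst limit of 5000 only 5000 are credited.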


=> Here, the delay and bandwidth-limit policies are applied before handing control back
to ipfw through dummynet_send().


    m = serve_sched(NULL, si, dn_cfg.curr_time);


static void
dummynet_send(struct mbuf *m)
{
    struct mbuf *n;

    for (; m != NULL; m = n) {
        struct ifnet *ifp = NULL; /* gcc 3.4.6 complains */
        struct m_tag *tag;
        int dst;

=> IPFW actions

        switch (dst) {
        case DIR_OUT:
            ip_output(m, NULL, NULL, IP_FORWARDING, NULL, NULL);
            break;

        case DIR_IN:
            netisr_dispatch(NETISR_IP, m);
            break;

#ifdef INET6
        case DIR_IN | PROTO_IPV6:
            netisr_dispatch(NETISR_IPV6, m);
            break;

        case DIR_OUT | PROTO_IPV6:
            ip6_output(m, NULL, NULL, IPV6_FORWARDING, NULL, NULL,
                NULL);
            break;
#endif

        case DIR_FWD | PROTO_IFB: /* DN_TO_IFB_FWD: */
            if (bridge_dn_p != NULL)
                ((*bridge_dn_p)(m, ifp));
            else
                printf("dummynet: if_bridge not loaded\n");
            break;

        case DIR_IN | PROTO_LAYER2: /* DN_TO_ETH_DEMUX: */
            /*
             * The Ethernet code assumes the Ethernet header is
             * contiguous in the first mbuf header.
             * Ensure this is true.
             */
            if (m->m_len < ETHER_HDR_LEN &&
                (m = m_pullup(m, ETHER_HDR_LEN)) == NULL) {
                printf("dummynet/ether: pullup failed, "
                    "dropping packet\n");
                break;
            }
            ether_demux(m->m_pkthdr.rcvif, m);
            break;

        case DIR_OUT | PROTO_LAYER2: /* N_TO_ETH_OUT: */
            ether_output_frame(ifp, m);
            break;

        case DIR_DROP:
            /* drop the packet after some time */
            FREE_PKT(m);
            break;

        default:
            printf("dummynet: bad switch %d!\n", dst);
            FREE_PKT(m);
            break;
2/ IPFW firewall rule route study


Now, let’s see how a firewall rule is processed in the kernel.


First, the sockopt handler copies the rule from userland to the kernel and checks the rule’s validity.


int
ipfw_ctl(struct sockopt *sopt)
{
#define RULE_MAXSIZE    (512*sizeof(u_int32_t))
    int error;
    size_t size, valsize;
    struct ip_fw *buf;
    struct ip_fw_rule0 *rule;
    struct ip_fw_chain *chain;
    u_int32_t rulenum[2];
    uint32_t opt;
    struct rule_check_info ci;
    IPFW_RLOCK_TRACKER;

    chain = &V_layer3_chain;
    error = 0;

    /* Save original valsize before it is altered via sooptcopyin() */
    valsize = sopt->sopt_valsize;
    opt = sopt->sopt_name;

    case IP_FW_ADD:
        rule = malloc(RULE_MAXSIZE, M_TEMP, M_WAITOK);

=> Here, we must check whether the rule is in FreeBSD 7.x format; the sopt_valsize
field gives this hint after the rule is copied into the kernel. If so, it
is converted to FreeBSD 8.x format.

        error = sooptcopyin(sopt, rule, RULE_MAXSIZE,
            sizeof(struct ip_fw7));

        if (error == 0)
            error = check_ipfw_rule0(rule, size, &ci);
        if (error == 0) {
            /* locking is done within add_rule() */
            struct ip_fw *krule;
            krule = ipfw_alloc_rule(chain, RULEKSIZE0(rule));
            ci.urule = (caddr_t)rule;
            ci.krule = krule;
            import_rule0(&ci);
            error = commit_rules(chain, &ci, 1);

=> If the userland requested an answer, it is converted back to FreeBSD 7.x
format when necessary.

            if (!error && sopt->sopt_dir == SOPT_GET) {
                if (is7) {
                    error = convert_rule_to_7(rule);
                    size = RULESIZE7(rule);
                    if (error) {
                        free(rule, M_TEMP);
                        return error;
                    }
                }

=> Sending back the rule data to the userland:

                error = sooptcopyout(sopt, rule, size);

Afterwards, ipfw_chk() is the main rule-checking routine, where it is decided whether a packet ought
to be accepted, dropped, passed to the dummynet module, and so on.


int
ipfw_chk(struct ip_fw_args *args)
{

    /* Identify IP packets and fill up variables. */
=> Similar checking is done for IPv4.

    if (pktlen >= sizeof(struct ip6_hdr) &&
        (args->eh == NULL || etype == ETHERTYPE_IPV6) && ip->ip_v == 6) {
        struct ip6_hdr *ip6 = (struct ip6_hdr *)ip;
        is_ipv6 = 1;
        args->f_id.addr_type = 6;
        hlen = sizeof(struct ip6_hdr);
        proto = ip6->ip6_nxt;

        /* Search extension headers to find upper layer protocols */
        while (ulp == NULL && offset == 0) {
            switch (proto) {
            case IPPROTO_ICMPV6:
                PULLUP_TO(hlen, ulp, struct icmp6_hdr);
                icmp6_type = ICMP6(ulp)->icmp6_type;
                break;

            case IPPROTO_TCP:
                PULLUP_TO(hlen, ulp, struct tcphdr);
                dst_port = TCP(ulp)->th_dport;
                src_port = TCP(ulp)->th_sport;
                /* save flags for dynamic rules */
                args->f_id._flags = TCP(ulp)->th_flags;
                break;

            case IPPROTO_SCTP:
                PULLUP_TO(hlen, ulp, struct sctphdr);
                src_port = SCTP(ulp)->src_port;
                dst_port = SCTP(ulp)->dest_port;
                break;

            case IPPROTO_UDP:
                PULLUP_TO(hlen, ulp, struct udphdr);
                dst_port = UDP(ulp)->uh_dport;
                src_port = UDP(ulp)->uh_sport;
                break;
=> Here comes an important part for building your future custom module: the
check against the known rules, where each packet is inspected by the following
loop:


    for (; f_pos < chain->n_rules; f_pos++) {
        ipfw_insn *cmd;
        uint32_t tablearg = 0;
        int l, cmdlen, skip_or; /* skip rest of OR block */
        struct ip_fw *f;

        f = chain->map[f_pos];
        if (V_set_disable & (1 << f->set))
            continue;

        skip_or = 0;
        for (l = f->cmd_len, cmd = f->cmd; l > 0;
            l -= cmdlen, cmd += cmdlen) {
            int match;

            /*
             * check_body is a jump target used when we find a
             * CHECK_STATE, and need to jump to the body of
             * the target rule.
             */

            /* check_body: */
            cmdlen = F_LEN(cmd);
            /*
             * An OR block (insn_1 || .. || insn_n) has the
             * F_OR bit set in all but the last instruction.
             * The first match will set "skip_or", and cause
             * the following instructions to be skipped until
             * past the one with the F_OR bit clear.
             */
            if (skip_or) {      /* skip this instruction */
                if ((cmd->len & F_OR) == 0)
                    skip_or = 0;    /* next one is good */
                continue;
            }
            match = 0; /* set to 1 if we succeed */

            switch (cmd->opcode) {
            /*
             * The first set of opcodes compares the packet's
             * fields with some pattern, setting 'match' if a
             * match is found. At the end of the loop, there is
             * logic to deal with F_NOT and F_OR flags associated
             * with the opcode.
             */
            case O_NOP:
                match = 1;
                break;


            case O_FORWARD_MAC:
                printf("ipfw: opcode %d unimplemented\n",
                    cmd->opcode);
                break;

            case O_GID:
            case O_UID:
            case O_JAIL:
                /*
                 * We only check offset == 0 && proto != 0,
                 * as this ensures that we have a
                 * packet with the ports info.
                 */
                if (offset != 0)
                    break;
                if (proto == IPPROTO_TCP ||
                    proto == IPPROTO_UDP)
                    match = check_uidgid(
                        (ipfw_insn_u32 *)cmd,
                        args, &ucred_lookup,
#ifdef __FreeBSD__
                        &ucred_cache);
#else
                        (void *)&ucred_cache);
#endif
                break;

            case O_RECV:
                match = iface_match(m->m_pkthdr.rcvif,
                    (ipfw_insn_if *)cmd, chain, &tablearg);
                break;

            case O_MAC_TYPE:
                if (args->eh != NULL) {
                    u_int16_t *p =
                        ((ipfw_insn_u16 *)cmd)->ports;
                    int i;

                    for (i = cmdlen - 1; !match && i > 0;
                        i--, p += 2)
                        match = (etype >= p[0] &&
                            etype <= p[1]);
                }
                break;

            case O_FRAG:
                match = (offset != 0);
                break;

            case O_IN:  /* "out" is "not in" */
                match = (oif == NULL);
                break;

            case O_LAYER2:
                match = (args->eh != NULL);
                break;


            case O_DIVERTED:
                {
                /* For diverted packets, args->rule.info
                 * contains the divert port (in host format)
                 * reason and direction.
                 */
                uint32_t i = args->rule.info;
                match = (i & IPFW_IS_MASK) == IPFW_IS_DIVERT &&
                    cmd->arg1 & ((i & IPFW_INFO_IN) ? 1 : 2);
                }
                break;

            case O_PROTO:
                /*
                 * We do not allow an arg of 0, so the
                 * check of "proto" only suffices.
                 */
                match = (proto == cmd->arg1);
                break;

            case O_IP_SRC:
                match = is_ipv4 &&
                    (((ipfw_insn_ip *)cmd)->addr.s_addr ==
                    src_ip.s_addr);
                break;

            case O_IP_SRC_LOOKUP:
            case O_IP_DST_LOOKUP:
                if (is_ipv4) {
                    uint32_t key =
                        (cmd->opcode == O_IP_DST_LOOKUP) ?
                        dst_ip.s_addr : src_ip.s_addr;
                    uint32_t v = 0;

                    if (cmdlen > F_INSN_SIZE(ipfw_insn_u32)) {
                        /* generic lookup. The key must be
                         * in 32bit big-endian format.
                         */
                        v = ((ipfw_insn_u32 *)cmd)->d[1];
                        if (v == 0)
                            key = dst_ip.s_addr;
                        else if (v == 1)
                            key = src_ip.s_addr;
                        else if (v == 6) /* dscp */
                            key = (ip->ip_tos >> 2) & 0x3f;
                        else if (offset != 0)
                            break;
                        else if (proto != IPPROTO_TCP &&
                            proto != IPPROTO_UDP)
                            break;
                        else if (v == 2)
                            key = dst_port;
                        else if (v == 3)
                            key = src_port;
#ifndef USERSPACE
                        else if (v == 4 || v == 5) {
                            check_uidgid(
                                (ipfw_insn_u32 *)cmd,
                                args, &ucred_lookup,
#ifdef __FreeBSD__
                                &ucred_cache);
                            if (v == 4 /* O_UID */)
                                key = ucred_cache->cr_uid;
                            else if (v == 5 /* O_JAIL */)
                                key = ucred_cache->cr_prison->pr_id;
#else /* !__FreeBSD__ */
                                (void *)&ucred_cache);
                            if (v == 4 /* O_UID */)
                                key = ucred_cache.uid;
                            else if (v == 5 /* O_JAIL */)
                                key = ucred_cache.xid;
#endif /* !__FreeBSD__ */
                        } else
#endif /* !USERSPACE */
                            break;
                    }
                    match = ipfw_lookup_table(chain,
                        cmd->arg1, key, &v);
                    if (!match)
                        break;
                    if (cmdlen == F_INSN_SIZE(ipfw_insn_u32))
                        match =
                            ((ipfw_insn_u32 *)cmd)->d[0] == v;
                    else
                        tablearg = v;
                } else if (is_ipv6) {
                    uint32_t v = 0;
                    void *pkey = (cmd->opcode == O_IP_DST_LOOKUP) ?
                        &args->f_id.dst_ip6 : &args->f_id.src_ip6;
                    match = ipfw_lookup_table_extended(chain,
                        cmd->arg1,
                        sizeof(struct in6_addr),
                        pkey, &v);
                    if (cmdlen == F_INSN_SIZE(ipfw_insn_u32))
                        match = ((ipfw_insn_u32 *)cmd)->d[0] == v;
                    if (match)
                        tablearg = v;
                }
                break;


            case O_IP_SRC_MASK:
            case O_IP_DST_MASK:
                if (is_ipv4) {
                    uint32_t a =
                        (cmd->opcode == O_IP_DST_MASK) ?
                        dst_ip.s_addr : src_ip.s_addr;
                    uint32_t *p = ((ipfw_insn_u32 *)cmd)->d;
                    int i = cmdlen - 1;

                    for (; !match && i > 0; i -= 2, p += 2)
                        match = (p[0] == (a & p[1]));
                }
                break;

            case O_IP_SRC_ME:
                if (is_ipv4) {
                    struct ifnet *tif;

                    INADDR_TO_IFP(src_ip, tif);
                    match = (tif != NULL);
                    break;
                }
#ifdef INET6
                /* FALLTHROUGH */
            case O_IP6_SRC_ME:
                match = is_ipv6 && search_ip6_addr_net(&args->f_id.src_ip6);
#endif
                break;

            case O_IP_DST_SET:
            case O_IP_SRC_SET:
                if (is_ipv4) {
                    u_int32_t *d = (u_int32_t *)(cmd + 1);
                    u_int32_t addr =
                        cmd->opcode == O_IP_DST_SET ?
                        args->f_id.dst_ip :
                        args->f_id.src_ip;

                    if (addr < d[0])
                        break;
                    addr -= d[0]; /* subtract base */
                    match = (addr < cmd->arg1) &&
                        (d[1 + (addr >> 5)] &
                        (1 << (addr & 0x1f)));
                }
                break;

            case O_IP_DST:
                match = is_ipv4 &&
                    (((ipfw_insn_ip *)cmd)->addr.s_addr ==
                    dst_ip.s_addr);
                break;

            case O_IP_DST_ME:
                if (is_ipv4) {
                    struct ifnet *tif;

                    INADDR_TO_IFP(dst_ip, tif);
                    match = (tif != NULL);
                    break;
                }
#ifdef INET6
                /* FALLTHROUGH */
            case O_IP6_DST_ME:
                match = is_ipv6 && search_ip6_addr_net(&args->f_id.dst_ip6);
#endif
                break;


            case O_IP_SRCPORT:
            case O_IP_DSTPORT:
                /*
                 * offset == 0 && proto != 0 is enough
                 * to guarantee that we have a
                 * packet with port info.
                 */
                if ((proto == IPPROTO_UDP || proto == IPPROTO_TCP)
                    && offset == 0) {
                    u_int16_t x =
                        (cmd->opcode == O_IP_SRCPORT) ?
                        src_port : dst_port;
                    u_int16_t *p =
                        ((ipfw_insn_u16 *)cmd)->ports;
                    int i;

                    for (i = cmdlen - 1; !match && i > 0;
                        i--, p += 2)
                        match = (x >= p[0] && x <= p[1]);
                }
                break;

            case O_ICMPTYPE:
                match = (offset == 0 && proto == IPPROTO_ICMP &&
                    icmptype_match(ICMP(ulp), (ipfw_insn_u32 *)cmd));
                break;

            case O_LOG:
                ipfw_log(chain, f, hlen, args, m,
                    oif, offset | ip6f_mf, tablearg, ip);
                match = 1;
                break;

            case O_ANTISPOOF:
                /* Outgoing packets automatically pass/match */
                if (oif == NULL && hlen > 0 &&
                    ((is_ipv4 && in_localaddr(src_ip))
#ifdef INET6
                    || (is_ipv6 &&
                    in6_localaddr(&(args->f_id.src_ip6)))
#endif
                    ))
                    match =
#ifdef INET6
                        is_ipv6 ? verify_path6(
                            &(args->f_id.src_ip6),
                            m->m_pkthdr.rcvif,
                            args->f_id.fib) :
#endif
                        verify_path(src_ip,
                            m->m_pkthdr.rcvif,
                            args->f_id.fib);
                else
                    match = 1;
                break;

            case O_IPSEC:
#ifdef IPSEC
                match = (m_tag_find(m,
                    PACKET_TAG_IPSEC_IN_DONE, NULL) != NULL);
#endif
                /* otherwise no match */
                break;


#ifdef INET6
            case O_IP6_SRC:
                match = is_ipv6 &&
                    IN6_ARE_ADDR_EQUAL(&args->f_id.src_ip6,
                    &((ipfw_insn_ip6 *)cmd)->addr6);
                break;

            case O_IP6_DST:
                match = is_ipv6 &&
                    IN6_ARE_ADDR_EQUAL(&args->f_id.dst_ip6,
                    &((ipfw_insn_ip6 *)cmd)->addr6);
                break;

            case O_IP6_SRC_MASK:
            case O_IP6_DST_MASK:
                if (is_ipv6) {
                    int i = cmdlen - 1;
                    struct in6_addr p;
                    struct in6_addr *d =
                        &((ipfw_insn_ip6 *)cmd)->addr6;

                    for (; !match && i > 0; d += 2,
                        i -= F_INSN_SIZE(struct in6_addr)
                        * 2) {
                        p = (cmd->opcode ==
                            O_IP6_SRC_MASK) ?
                            args->f_id.src_ip6 :
                            args->f_id.dst_ip6;
                        APPLY_MASK(&p, &d[1]);
                        match =
                            IN6_ARE_ADDR_EQUAL(&d[0],
                            &p);
                    }
                }
                break;

            case O_FLOW6ID:
                match = is_ipv6 &&
                    flow6id_match(args->f_id.flow_id6,
                    (ipfw_insn_u32 *)cmd);
                break;

            case O_EXT_HDR:
                match = is_ipv6 &&
                    (ext_hd & ((ipfw_insn *)cmd)->arg1);
                break;

            case O_IP6:
                match = is_ipv6;
                break;
#endif

            case O_IP4:
                match = is_ipv4;
                break;


            case O_TAG: {
                struct m_tag *mtag;
                uint32_t tag = TARG(cmd->arg1, tag);

                /* Packet is already tagged with this tag? */
                mtag = m_tag_locate(m, MTAG_IPFW, tag, NULL);

                /* We have `untag' action when F_NOT flag is
                 * present. And we must remove this mtag from
                 * mbuf and reset `match' to zero (`match' will
                 * be inversed later).
                 * Otherwise, we should allocate new mtag and
                 * push it into mbuf.
                 */
                if (cmd->len & F_NOT) { /* `untag' action */
                    if (mtag != NULL)
                        m_tag_delete(m, mtag);
                    match = 0;
                } else {
                    if (mtag == NULL) {
                        mtag = m_tag_alloc(MTAG_IPFW,
                            tag, 0, M_NOWAIT);
                        if (mtag != NULL)
                            m_tag_prepend(m, mtag);
                    }
                    match = 1;
                }
                break;
            }

            case O_TAGGED: {
                struct m_tag *mtag;
                uint32_t tag = TARG(cmd->arg1, tag);

                if (cmdlen == 1) {
                    match = m_tag_locate(m, MTAG_IPFW,
                        tag, NULL) != NULL;
                    break;
                }

                /* we have ranges */
                for (mtag = m_tag_first(m);
                    mtag != NULL && !match;
                    mtag = m_tag_next(m, mtag)) {
                    uint16_t *p;
                    int i;

                    if (mtag->m_tag_cookie != MTAG_IPFW)
                        continue;

                    p = ((ipfw_insn_u16 *)cmd)->ports;
                    i = cmdlen - 1;
                    for (; !match && i > 0; i--, p += 2)
                        match =
                            mtag->m_tag_id >= p[0] &&
                            mtag->m_tag_id <= p[1];
                }
                break;
            }


            case O_LIMIT:

            case O_ACCEPT:
                retval = 0;     /* accept */
                l = 0;          /* exit inner loop */
                done = 1;       /* exit outer loop */
                break;

            case O_PIPE:
            case O_QUEUE:
                set_match(args, f_pos, chain);
                args->rule.info = TARG(cmd->arg1, pipe);
                if (cmd->opcode == O_PIPE)
                    args->rule.info |= IPFW_IS_PIPE;
                if (V_fw_one_pass)
                    args->rule.info |= IPFW_ONEPASS;
                retval = IP_FW_DUMMYNET;
                l = 0;          /* exit inner loop */
                done = 1;       /* exit outer loop */
                break;

            case O_DIVERT:
            case O_TEE:
                if (args->eh) /* not on layer 2 */
                    break;
                /* otherwise, this is terminal */
                l = 0;          /* exit inner loop */
                done = 1;       /* exit outer loop */
                retval = (cmd->opcode == O_DIVERT) ?
                    IP_FW_DIVERT : IP_FW_TEE;
                set_match(args, f_pos, chain);
                args->rule.info = TARG(cmd->arg1, divert);
                break;

            case O_REJECT:
                /*
                 * Drop the packet and send a reject notice
                 * if the packet is not ICMP (or is an ICMP
                 * query), and it is not multicast/broadcast.
                 */
                if (hlen > 0 && is_ipv4 && offset == 0 &&
                    (proto != IPPROTO_ICMP ||
                    is_icmp_query(ICMP(ulp))) &&
                    !(m->m_flags & (M_BCAST|M_MCAST)) &&
                    !IN_MULTICAST(ntohl(dst_ip.s_addr))) {
                    send_reject(args, cmd->arg1, iplen, ip);
                    m = args->m;
                }
                /* FALLTHROUGH */

            case O_DENY:
                retval = IP_FW_DENY;
                l = 0;          /* exit inner loop */
                done = 1;       /* exit outer loop */
                break;


            case O_FORWARD_IP:
                if (args->eh) /* not valid on layer2 pkts */
                    break;
                if (q == NULL || q->rule != f ||
                    dyn_dir == MATCH_FORWARD) {
                    struct sockaddr_in *sa;

                    sa = &(((ipfw_insn_sa *)cmd)->sa);
                    if (sa->sin_addr.s_addr == INADDR_ANY) {
#ifdef INET6
                        /*
                         * We use O_FORWARD_IP opcode for
                         * fwd rule with tablearg, but tables
                         * now support IPv6 addresses. And
                         * when we are inspecting IPv6 packet,
                         * we can use nh6 field from
                         * table_value as next_hop6 address.
                         */
                        if (is_ipv6) {
                            struct sockaddr_in6 *sa6;

                            sa6 = args->next_hop6 =
                                &args->hopstore6;
                            sa6->sin6_family = AF_INET6;
                            sa6->sin6_len = sizeof(*sa6);
                            sa6->sin6_addr = TARG_VAL(
                                chain, tablearg, nh6);
                            /*
                             * Set sin6_scope_id only for
                             * link-local unicast addresses.
                             */
                            if (IN6_IS_ADDR_LINKLOCAL(
                                &sa6->sin6_addr))
                                sa6->sin6_scope_id =
                                    TARG_VAL(chain,
                                    tablearg,
                                    zoneid);
                        } else
#endif
                        {
                            sa = args->next_hop =
                                &args->hopstore;
                            sa->sin_family = AF_INET;
                            sa->sin_len = sizeof(*sa);
                            sa->sin_addr.s_addr = htonl(
                                TARG_VAL(chain, tablearg,
                                nh4));
                        }
                    } else {
                        args->next_hop = sa;
                    }
                }
                retval = IP_FW_PASS;
                l = 0;          /* exit inner loop */
                done = 1;       /* exit outer loop */
                break;


            case O_NAT:
                l = 0;          /* exit inner loop */
                done = 1;       /* exit outer loop */
                if (!IPFW_NAT_LOADED) {
                    retval = IP_FW_DENY;
                    break;
                }

                struct cfg_nat *t;
                int nat_id;

                set_match(args, f_pos, chain);
                /* Check if this is 'global' nat rule */
                if (cmd->arg1 == 0) {
                    retval = ipfw_nat_ptr(args, NULL, m);
                    break;
                }
                t = ((ipfw_insn_nat *)cmd)->nat;
                if (t == NULL) {
                    nat_id = TARG(cmd->arg1, nat);
                    t = (*lookup_nat_ptr)(&chain->nat, nat_id);

                    if (t == NULL) {
                        retval = IP_FW_DENY;
                        break;
                    }
                    if (cmd->arg1 != IP_FW_TARG)
                        ((ipfw_insn_nat *)cmd)->nat = t;
                }
                retval = ipfw_nat_ptr(args, t, m);
                break;

=> We could add new types of rules here. As we have seen in the
previous modules, new opcodes are needed.

            default:
                panic("-- unknown opcode %d\n", cmd->opcode);
            } /* end of switch() on opcodes */
Final exercise


Now is the time to use all the knowledge you acquired through the modules.


The goal is to add a firewall policy (for example, one based on IP / country codes): add the detection
algorithm on the kernel side, then the rule configuration from the userland perspective, updating the
ipfw command line accordingly and, above all, the kernel side (preferably as a distinct ipfw kernel
module).